Text File | 1994-12-21 | 33KB | 689 lines
CHAPTER 10 RELOCATION AND LINKAGE
A86 allows you to produce either .COM files, which can be run
immediately as standalone programs, or .OBJ files, to be fed to
the MS-DOS LINK program. In this chapter I'll discuss the .OBJ
mode of A86.
.OBJ Production Made Easy
I'll start by giving you the minimum amount of information you
need to know to produce .OBJ files. If you are writing short
interface routines, and do not want to concern yourself with the
esoterica of .OBJ files (segments, groups, publics, etc.), you
can survive quite nicely by reading only this section.
There are two ways you can cause A86 to produce a .OBJ file as
its object output. One way is to explicitly give .OBJ as the
output file name: for example, you can assemble the source file
FOO.8 by giving the command "A86 FOO.8 FOO.OBJ". The other way
is to specify the switch +O (letter O not digit 0). This is
illustrated by the invocation "A86 +O FOO.8", which will have the
same effect as the first invocation.
My design philosophy for .OBJ production is to accommodate two
types of user. The first type of user is writing new code, to
link to other (usually high level language) modules. That person
should be able to write the module with a minimum of red tape,
and have A86 do the right thing. The second type of user has
existing modules written for Intel/IBM assemblers, and wants to
port them to A86. A86 should recognize and act upon all the
relocation directives (SEGMENT, GROUP, PUBLIC, EXTRN, NAME, END)
given. The assembly should work even if several files, assembled
separately under the Intel/IBM assembler, are fed to a single A86
assembly. You'll see if you read on through this entire chapter
that the multiple-files requirement causes A86 to interpret some
of the relocation directives a little differently (while
achieving compatible results).
Let's suppose you're writing new code: for example, an interface
routine to the "C" language, that multiplies a 16-bit number by
10. "C" pushes the input number onto the stack, before calling
your routine. Your code needs to get the number, multiply it by
10, and return the answer in the AX register. You can code it:
_MUL10:           ; "C" expects all public names to start with "_"
  PUSH BP         ; "C" expects BP to be preserved
  MOV BP,SP       ; we use BP to address the stack
  MOV AX,[BP+4]   ; fetch the number N, beyond BP and the ret addr
  ADD AX,AX       ; 2N
  MOV BX,AX       ; 2N is saved in BX
  ADD AX,AX       ; 4N
  ADD AX,AX       ; 8N
  ADD AX,BX       ; 8N + 2N = 10N
  POP BP          ; BP is restored
  RET             ; go back to caller
These 11 lines can be your entire source file! If you name the
file MUL10.8, A86 will create an object file MUL10.OBJ, that
conforms to the standard SMALL model of computation for high
level languages. If you use RETF instead of RET (and fetch the
operand from BP+6 instead of BP+4, because a far return address
occupies four bytes rather than two), the object module will
conform to the standard LARGE model of computation.
All the red tape information required by the high level language
is provided implicitly by A86. I'll go through this information
in detail later, but you should need to read about it only if
you're curious.
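For comparison, here is a sketch of the LARGE-model variant just
described. The label and the arithmetic are those of the listing
above; the only changes are the RETF and the operand offset, which
moves to BP+6 because a far call leaves a 4-byte return address
(offset and segment) on the stack instead of a 2-byte one.

_MUL10:           ; same public name as before
  PUSH BP         ; preserve BP for the caller
  MOV BP,SP       ; address the stack through BP
  MOV AX,[BP+6]   ; N lies beyond saved BP and the 4-byte far ret addr
  ADD AX,AX       ; 2N
  MOV BX,AX       ; 2N is saved in BX
  ADD AX,AX       ; 4N
  ADD AX,AX       ; 8N
  ADD AX,BX       ; 8N + 2N = 10N
  POP BP          ; BP is restored
  RETF            ; far return to the caller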
What happens if you need to access symbols outside the module
you're assembling? If the type of the symbol is correctly
guessed from the instruction that refers to it, then you can
simply refer to it, and leave it undefined within the module. For
example, if A86 sees the instruction CALL PRINT with PRINT
undefined, it will assume that PRINT is a NEAR procedure. If
PRINT is never defined within the module, A86 will act as if you
declared PRINT via the directive EXTRN PRINT:NEAR. The address
of PRINT will be plugged into your instruction by LINK when it
combines A86's .OBJ file with the high level language's .OBJ
files, to make the final program.
In general, the undefined operand to any CALL or JMP instruction
is assumed to be NEAR. The second (source) operand to a MOV or
arithmetic instruction is assumed to be ABS (i.e., an immediate
constant). An undefined first (destination) operand is assumed
to be a simple memory variable, of the same size (BYTE or WORD)
as the register given in the second operand. If your external
symbol does not comply with these guidelines, you need to declare
it with an EXTRN before you use it. (You can also use EXTRN to
declare types of non-complying forward references within your
module, as you'll see later.)
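The defaults described above can be illustrated with a short
fragment. The names PRINT, TABLE_SIZE, TOTAL, and LIMIT are
hypothetical, chosen only for this sketch; none of them is defined
in the module, so each one becomes an external reference. LIMIT is
the non-complying case: it is a memory word used as a source
operand, so it is declared with EXTRN before its first use.

EXTRN LIMIT:WORD      ; LIMIT is a memory word, not a constant
  CALL PRINT          ; undefined CALL operand: assumed NEAR
  MOV CX,TABLE_SIZE   ; undefined source operand: assumed ABS
  MOV TOTAL,AX        ; undefined destination: assumed WORD, like AX
  MOV DX,LIMIT        ; declared above: loads the word stored at LIMIT

Without the EXTRN line, the last instruction would fall under the
source-operand rule, and LIMIT would be treated as an immediate
constant.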
If you'd like to link the MUL10 procedure to Turbo Pascal V4.0 or
later, you need to add the line CODE SEGMENT PUBLIC at the top of
the program, to name the program segment according to Turbo
Pascal's expectations. You may dispense with the leading
underscore in the name MUL10; Turbo Pascal does not require or
expect it.
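As a sketch of those two changes, the routine might look like the
following for Turbo Pascal. Two details go beyond what is stated
above, and are assumptions to check against your compiler's
documentation: that the routine is called as a NEAR routine (so
the operand is still at BP+4), and that under the Pascal calling
convention the called routine removes its own parameter from the
stack, hence RET 2 rather than a plain RET.

CODE SEGMENT PUBLIC   ; segment name that Turbo Pascal expects
MUL10:                ; no leading underscore
  PUSH BP             ; preserve BP for the caller
  MOV BP,SP           ; address the stack through BP
  MOV AX,[BP+4]       ; fetch N (assumes a NEAR call)
  ADD AX,AX           ; 2N
  MOV BX,AX           ; 2N is saved in BX
  ADD AX,AX           ; 4N
  ADD AX,AX           ; 8N
  ADD AX,BX           ; 8N + 2N = 10N
  POP BP              ; BP is restored
  RET 2               ; assumption: callee pops the 2-byte parameter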
At this point, if you're a casual user, I think you've read
enough to get going! Read further only if you wish; or if you
get stuck, and need to master the esoterica.
Overview of Relocation and Linkage
When you assemble a program directly into a .COM file, the
program has just two forms: the source program, that you can
understand, and the .COM file, that the computer can "understand"
(i.e., execute). A .OBJ file is an intermediate format: neither
you nor the (executing) computer can make sense out of a .OBJ
file; only programs like LINK interpret .OBJ files. The purpose
of a .OBJ file is to allow you to assemble or compile just a part
of a program. The other parts (also in the form of .OBJ files)
can be produced at a different time; often by a different
assembler or compiler, whose source files are in a different
language. It's easy to see where the word "linkage" comes from:
the LINK program puts the pieces of a program together. The
"relocation" comes because the assembler or compiler that makes a
given program piece doesn't know how many other pieces will come
before it, or how big the other pieces will be. Each piece is
constructed as if it started at location 0 within the program;
then LINK "relocates" the piece to its true location.
Many of the relocation features of 86 assembly language are
couched in terms of LINK's point of view, so we must look at the
way LINK sees things. LINK calls a .OBJ file an "object module",
or just "module". Each module has a NAME, that can be referred
to when LINK issues diagnostic messages, such as error messages
and symbol maps. If a program symbol is used only within a
single module, it does not need to be given to LINK, except
possibly to pass along to a symbolic debugger. On the other
hand, if a program symbol is defined in one module and referenced
in other modules, then LINK needs to know the name of the symbol,
so it can resolve the references. Such a symbol is PUBLIC in the
module in which it is defined; it is "external" in the other
modules, containing references to it. Finally, exactly one
module in a program must contain the starting location for the
program; that module is called the "main module", and it must
supply the starting address (which is not necessarily at the
beginning of the module).
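To make these terms concrete, here is a sketch of a two-module
program that uses the directives named earlier (NAME, PUBLIC,
EXTRN, END). The file names MAIN.8 and SHOW.8, the module names,
and the symbols START and SHOW are hypothetical. The sketch shows
only the linkage directives; it is not a complete program (it sets
up no stack or data, and the body of SHOW is omitted).

; --- MAIN.8: the main module ---
NAME MAINMOD       ; module name, for LINK's diagnostic messages
EXTRN SHOW:NEAR    ; external here (optional: an undefined CALL
                   ;   operand is assumed NEAR anyway)
START:
  CALL SHOW        ; LINK plugs in the address of SHOW
  MOV AX,4C00H     ; DOS function 4CH: terminate, return code 0
  INT 21H
END START          ; main module: START is the starting address

; --- SHOW.8: a second module ---
NAME SHOWMOD
PUBLIC SHOW        ; defined here, visible to other modules
SHOW:
  RET              ; body omitted

Each file would be assembled to its own .OBJ file, and the two
.OBJ files would then be given to LINK together. Only MAIN.8
supplies a starting address.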
In the 86 family of microprocessors, the LINK system also does
much to manage the memory segments that a program will fit into,
and get its data from. The (grotesquely ornate) level of support
for segmentation was dictated by Intel, when it specified (and
IBM and the compiler makers accepted) the format that .OBJ files
will have. I attended the fateful meeting at Intel, in which the
crucial design decisions were made. I regret to say that I sat
quietly, while engineers more senior than I applied their fertile
imaginations to construct fanciful scenarios which they felt had
to be supported by LINK. Let's now review the resulting
segmentation model.
The parts of a program, as viewed by LINK, come in three
different sizes: they